
49 research outputs found

Evaluating current automatic de-identification methods with Veteran’s health administration clinical documents

Author: BA Beckwith
Brett R South
D Gupta
E Aramaki
F Jeffrey Friedlin
FJ Friedlin
G Szarvas
H Dalianis
I Neamatullah
J Aberdeen
J Gardner
JJ Berman
K Hara
Matthew H Samore
O Uzuner
Oscar Ferrández
P Ohm
R Grishman
Shuying Shen
SM Meystre
Stéphane M Meystre
Y Guo
Publication venue: 'Springer Science and Business Media LLC'

Relaxation Height in Energy Landscapes: an Application to Multiple Metastable States

Author: A. Bovier
D. Mehta
D.J. Wales
E. Olivieri
E.N.M. Cirillo
Emilio N. M. Cirillo
F. Hollander den
F. Manzo
F.R. Nardi
Francesca R. Nardi
G. Grinstein
H.W. Capel
J. Beltrán
L. Alonso
M. Blume
M. Bousquet-Mélou
M.I. Friedlin
R.J. Glauber
S. Bigelis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The study of systems with multiple (not necessarily degenerate) metastable states presents subtle mathematical difficulties related to the variational problem that has to be solved in these cases. We introduce the notion of relaxation height in a general energy landscape and we prove sufficient conditions which are valid even in the presence of multiple metastable states. We show how these results can be used to approach the problem of multiple metastable states via the modern theories of metastability. We finally apply these general results to the Blume-Capel model for a particular choice of the parameters ensuring the existence of two metastable states that are not degenerate in energy.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Fluctuations in Nonequilibrium Statistical Mechanics: Models, Mathematical Theory, Physical Mechanisms

The fluctuations in nonequilibrium systems are under intense theoretical and experimental investigation. Topical "fluctuation relations" describe symmetries of the statistical properties of certain observables, in a variety of models and phenomena. They have been derived in deterministic and, later, in stochastic frameworks. Other results first obtained for stochastic processes, and later considered in deterministic dynamics, describe the temporal evolution of fluctuations. The field has grown beyond expectation: research works and different perspectives are proposed at an ever faster pace. Indeed, understanding fluctuations is important for the emerging theory of nonequilibrium phenomena, as well as for applications, such as those of nanotechnological and biophysical interest. However, the links among the different approaches and the limitations of these approaches are not fully understood. We focus on these issues, providing: a) analysis of the theoretical models; b) discussion of the rigorous mathematical results; c) identification of the physical mechanisms underlying the validity of the theoretical predictions, for a wide range of phenomena.

Comment: 44 pages, 2 figures. To appear in Nonlinearity (2007)

arXiv.org e-Print Archive

CiteSeerX

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Automatic de-identification of textual documents in the electronic health record: a review of recent research

Author: B Wellner
BA Beckwith
Brett R South
C Friedman
D Gupta
DA Dorr
E Aramaki
EM Fielstein
F Jeffrey Friedlin
FJ Friedlin
FP Morrison
G Szarvas
GPO U.S
H Cunningham
I Neamatullah
J Gardner
JJ Berman
K Atkinson
K Hara
L Sweeney
Matthew H Samore
NCI
NLM
O Uzuner
P Ruch
RK Taira
Shuying Shen
SM Meystre
SM Thomas
Stephane M Meystre
Y Guo
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract
Background: In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here.
Methods: This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers.
Results: The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries.
Conclusions: In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.
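The pattern-matching and dictionary approach named in the abstract above can be illustrated with a minimal sketch. It is not taken from any of the reviewed systems: the two regular expressions, the tiny name dictionary, and the `deidentify` function are all hypothetical and cover only a small fraction of the 18 HIPAA Safe Harbor elements.

```python
import re

# Hypothetical patterns for two PHI types; real systems use far richer rule sets.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical dictionary of known surnames (lowercased for lookup).
NAME_DICTIONARY = {"smith", "jones"}


def deidentify(text: str) -> str:
    """Replace dictionary names and pattern matches with bracketed PHI labels."""

    def mask_name(match: re.Match) -> str:
        word = match.group(0)
        return "[NAME]" if word.lower() in NAME_DICTIONARY else word

    # Dictionary pass: check every alphabetic token against the name list.
    text = re.sub(r"\b[A-Za-z]+\b", mask_name, text)

    # Pattern pass: substitute each regular-expression match with its label.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(deidentify("Mr. Smith seen on 03/14/2009, call 555-123-4567."))
# prints: Mr. [NAME] seen on [DATE], call [PHONE].
```

A dictionary lookup of this kind would also mask a clinical term that happens to match a listed surname, which is one way the "over-scrubbing" mentioned in the abstract can arise.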

Crossref

IUPUIScholarWorks

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Analysis and design of randomised clinical trials involving competing risks endpoints

Author: A Latouche
AWM Lee
B Friedlin
BC Tai
Bee-Choo Tai
BR Logan
D Machin
DA Schoenfeld
David Machin
G Schulgen
HT Kim
J Beyersmann
J Wee
JD Kalbfleisch
JJ Dignam
JJ Gaynor
Joseph Wee
JP Fine
M Pintilie
PA Poole-Wilson
PR Williamson
R Bajorunaite
R Gray
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: In randomised clinical trials involving time-to-event outcomes, the failures concerned may be events of an entirely different nature and as such define a classical competing risks framework. In designing and analysing clinical trials involving such endpoints, it is important to account for the competing events, and evaluate how each contributes to the overall failure. An appropriate choice of statistical model is important for adequate determination of sample size.
Methods: We describe how competing events may be summarised in such trials using cumulative incidence functions and Gray's test. The statistical modelling of competing events using proportional cause-specific and subdistribution hazard functions, and the corresponding procedures for sample size estimation are outlined. These are illustrated using data from a randomised clinical trial (SQNP01) of patients with advanced (non-metastatic) nasopharyngeal cancer.
Results: In this trial, treatment has no effect on the competing event of loco-regional recurrence. Thus the effects of treatment on the hazard of distant metastasis were similar via both the cause-specific (unadjusted csHR = 0.43, 95% CI 0.25 - 0.72) and subdistribution (unadjusted subHR = 0.43, 95% CI 0.25 - 0.76) hazard analyses, in favour of concurrent chemo-radiotherapy followed by adjuvant chemotherapy. Adjusting for nodal status and tumour size did not alter the results. The results of the logrank test (p = 0.002) comparing the cause-specific hazards and Gray's test (p = 0.003) comparing the cumulative incidences also led to the same conclusion. However, the subdistribution hazard analysis requires many more subjects than the cause-specific hazard analysis to detect the same magnitude of effect.
Conclusions: The cause-specific hazard analysis is appropriate for analysing competing risks outcomes when treatment has no effect on the cause-specific hazard of the competing event. It requires fewer subjects than the subdistribution hazard analysis for a similar effect size. However, if the main and competing events are influenced in opposing directions by an intervention, a subdistribution hazard analysis may be warranted.
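To make the cumulative incidence function mentioned in the abstract above concrete, here is a minimal sketch of a standard nonparametric estimator. The abstract does not say which estimator the authors used, so this choice is an assumption, and the `cumulative_incidence` function and its toy data are hypothetical. At each event time, the overall survival just before that time is multiplied by the fraction of at-risk subjects who fail from the cause of interest, and the products are summed.

```python
from typing import List, Tuple


def cumulative_incidence(
    times: List[float], causes: List[int], cause_of_interest: int
) -> List[Tuple[float, float]]:
    """Estimate the cumulative incidence of one cause under competing risks.

    causes[i] is 0 for a censored subject and a positive integer for the
    type of event observed. Returns (time, cumulative incidence) pairs at
    each time where at least one event of any cause occurred.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    overall_survival = 1.0  # probability of being free of any event so far
    incidence = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        events_all = 0
        events_k = 0
        # Count events among subjects tied at time t.
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                events_all += 1
            if data[j][1] == cause_of_interest:
                events_k += 1
            j += 1
        if events_all > 0:
            incidence += overall_survival * events_k / n_at_risk
            overall_survival *= 1 - events_all / n_at_risk
            curve.append((t, incidence))
        n_at_risk -= j - i  # events and censored subjects leave the risk set
        i = j
    return curve


# Toy data (hypothetical): cause 1 = event of interest, cause 2 = competing event.
curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause_of_interest=1)
print(curve)
```

Unlike one minus a Kaplan-Meier estimate that treats competing events as censoring, this estimate never exceeds the overall probability of failure, because each increment is weighted by survival from all causes.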

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Leicester Research Archive

Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections

The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI). A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis. An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52-68% and retained sensitivities of 69-73%. Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
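The trade-off between sensitivity and PPV reported in the abstract above follows from how these measures are defined on a confusion matrix. The sketch below shows the definitions only. The `ConfusionCounts` class and the counts are hypothetical and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives: reference cases flagged by the algorithm
    fp: int  # false positives: non-cases flagged by the algorithm
    fn: int  # false negatives: reference cases missed
    tn: int  # true negatives: non-cases correctly not flagged


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of reference cases that the algorithm detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-cases that the algorithm correctly leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Fraction of flagged visits that are reference cases."""
    return c.tp / (c.tp + c.fp)


# Hypothetical counts: 100 cases among 10,000 visits (1% prevalence).
counts = ConfusionCounts(tp=80, fp=120, fn=20, tn=9780)
print(sensitivity(counts), specificity(counts), ppv(counts))
```

With these illustrative counts the sensitivity is 0.8 and the specificity is close to 0.99, yet the PPV is only 0.4. When cases are rare, even a small false-positive rate yields more false alarms than true detections.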

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central